Immunoglobulin Superfamily Domains and Fragments with Increased

Solubility 



Field of the Invention 

The present invention relates to the modification of immunoglobulin superfamily (IgSF) 
domains and derivatives thereof so as to increase their solubility, and hence the yield, and 
ease of handling. 

Background to the Invention 

Small antibody fragments show exciting promise for use as therapeutic agents, diagnostic
reagents, and for biochemical research. Thus, they are needed in large amounts, and the
expression of antibody fragments, e.g. Fv, single-chain Fv (scFv), or Fab in the periplasm of
E. coli (Skerra & Plückthun, 1988; Better et al., 1988) is now used routinely in many
laboratories. Expression yields vary widely, however, especially in the case of scFvs. While
some fragments yield up to several mg of functional, soluble protein per litre and OD of
culture broth in shake flask culture (Carter et al., 1992; Plückthun et al., 1996), other
fragments may almost exclusively lead to insoluble material, often found in so-called
inclusion bodies. Functional protein may be obtained from the latter in modest yields by a
laborious and time-consuming refolding process. The factors influencing antibody
expression levels are still only poorly understood. Folding efficiency and stability of the
antibody fragments, protease lability and toxicity of the expressed proteins to the host
cells often severely limit actual production levels, and several attempts have been made to
increase expression yields. For example, Knappik & Plückthun (1995) have identified key
residues in the antibody framework which influence expression yields dramatically.
Similarly, Ullrich et al. (1995) found that point mutations in the CDRs can increase the
yields in periplasmic antibody fragment expression. Nevertheless, these strategies are only
applicable to a few antibodies.

The observations by Knappik & Plückthun (1995) indicate that optimizing those parts of
the antibody fragment which are not directly involved in antigen recognition can
significantly improve folding properties and production yields of recombinant Fv and scFv
constructs. The causes for the improved expression behavior lie in the decreased
aggregation behavior of these molecules. For other molecules, fragment stability and
protease resistance may also be affected. The understanding of how specific sequence
modifications change these properties is still very limited and currently under active
investigation.

Difficulties in expressing and manipulating protein domains may arise because amino 
acids which are normally buried within the protein structure become exposed when only a 
portion of the whole molecule is expressed. Aggregation may occur through interaction of 
newly solvent-exposed hydrophobic residues originally forming the contact regions 
between adjacent domains. Leistler and Perham (1994) showed that a certain domain
of glutathione reductase may be expressed separately from its neighboring domains, but 
the protein showed non-specific association in vitro forming multimeric protein species. 
The introduction of hydrophilic residues instead of exposed hydrophobic amino acids 
could decrease this aggregation tendency and thus stabilize this isolated domain. Both 
wild type and modified domains were exclusively found in inclusion bodies and had to be 
refolded. Although in vitro experiments have contributed much to defining the various
intermolecular interactions which drive folding processes, they are only of limited value in
predicting the folding behaviour of different polypeptide chains in vivo (Gething & Sambrook, 1992).
Thus, Leistler and Perham do not teach or suggest how to increase expression yields of 
soluble protein domains. 

In the case of antibodies, two chains comprising several domains dimerize, each domain
consisting of a β-barrel whose two β-sheets are held together by a disulphide bond,
forming the so-called immunoglobulin fold. Two domains, one variable domain (VL) and
one constant domain (CL) are adjacent along the longitudinal axis in the light chain
(VL-CL), and four domains, one variable domain (VH) and three constant domains (CH1 to
CH3) are adjacent along the longitudinal axis in the heavy chain (VH-CH1-CH2-CH3). In the
dimer formed by chains a and b, two such domains associate laterally: VLa with VHa, CLa
with CH1a, VLb with VHb, CLb with CH1b, CH2a with CH2b and CH3a with CH3b. In WO
92/01787 (Johnson et al., 1992), it is taught that isolated single domains, e.g. VH, can be 
modified in the former VL/VH interface region by exchanging hydrophobic residues by 
hydrophilic ones without changing the specificity of the parent domain. The rationale for 
WO 92/01787 was the assumption that exposed hydrophobic residues might lead to non- 
specific binding, interaction with surfaces and decreased stability. Data for increase in 
binding specificity was given, but increase in expression level was not shown. Furthermore, 
WO 92/01787 would not be applicable to any antibody fragment containing the complete 
antigen binding site, as it must contain VL and VH. 

The present inventors have found that expression problems are largely associated with a 
part of the molecule that has hitherto not been regarded relevant for expression studies 
and which comprises the interface between adjacent domains within an immunoglobulin 
chain. This surprising finding forms the basis of the present invention, which provides a
general solution to the problems associated with production of domains or fragments of
the immunoglobulin superfamily (IgSF), especially antibody fragments, which exhibit poor
solubility or reduced levels of expression.



Detailed Description of the Invention 

In addition to lateral interactions between domains of different chains described above,
there are well documented contacts between adjacent domains within individual chains
along the longitudinal axis. For example, in the case of an antibody (Lesk & Chothia,
1988), the "bottom" of VL makes contact with the "top" of CL and, in a similar manner,
there are contacts between VH and CH1. The contacts at these inter-domain interfaces are
probably essential for the compact arrangement of the Fab fragment and, as is typical for
such contacts, are at least partially hydrophobic in nature (Lesk & Chothia, 1988).

The basis of the present invention is the surprising finding that the solubility (and hence 
the yield) of antibody fragments comprising at least one domain can be dramatically 
increased by decreasing the hydrophobicity of former interfaces at the "end" of said 
domain, where it would normally adjoin a second domain within a chain in a larger 
antibody fragment or full antibody. This is surprising and could not have been predicted 
from the prior art (WO 92/01787), because the size of the longitudinal interface, for 
example, in a scFv fragment, is much smaller than that between VH and VL, and therefore, 
the amino acids which make up the interfaces between VH and CH1 or between VL and CL 
in a Fab fragment represent a much smaller proportion of the total surface area of the 
scFv molecule, and would accordingly be expected to play less of a role in determining the 
physical properties of the molecule. 

The present invention has the additional advantage that because the alterations effected 
in the molecules that lead to said decreased hydrophobicity of former interfaces are 
located at the most distant part of the domain from the CDRs, applying the invention is 
unlikely to have a deleterious effect on the binding properties of the molecule. This is not 
the case in WO 92/01787, where at least one modification is close to the CDRs and may 
therefore be expected to have an impact on antigen binding. Furthermore, WO 92/01787 
cannot be applied to VL/VH heterodimers, as explained above. 

The present invention relates to a modified immunoglobulin superfamily (IgSF) domain or 
fragment which differs from a parent IgSF domain or fragment in that the region which 
comprised or would comprise the interface with a second domain adjoined to said parent 
IgSF domain or fragment within the protein chain of a larger IgSF fragment or a full IgSF
protein, and which is exposed in said parent IgSF domain or fragment in the absence of
said second domain, is made more hydrophilic by modification. 

In the context of the present invention, the term immunoglobulin superfamily (IgSF) 
domain refers to those parts of members of the immunoglobulin superfamily which are 
characterized by the immunoglobulin fold, said superfamily comprising the
immunoglobulins or antibodies, and various other proteins such as T-cell receptors or
integrins. The term IgSF fragment refers to any portion of a member of the
immunoglobulin superfamily, said portion comprising at least one IgSF domain. The term
adjoining domain refers to a domain which is contiguous with a first domain. The term
interface refers to a region of said first domain where interaction with the adjoining
domain takes place. The terms hydrophobic and hydrophilic refer to a physical property of
amino acids, which can be estimated quantitatively: tabulated values of hydrophobicity
for the twenty naturally-occurring amino acids are available (Nozaki & Tanford, 1971;
Casari & Sippl, 1992; Rose & Wolfenden, 1993).

The residues to be modified can be identified in a number of ways. For example, in one
way, the solvent accessibilities (Lee & Richards, 1971) of hydrophobic interface residues in
said parent IgSF fragment compared to said larger IgSF fragment or full IgSF protein are
calculated, with high accessibilities indicating highly exposed residues. In a second way, 
the number of van der Waals contacts of hydrophobic interface residues in said larger 
IgSF fragment or full IgSF protein is calculated. A large number for a residue of said 
parent domain indicates that it will be highly solvent-exposed in the absence of an 
adjoining domain. There are other ways of calculating or determining residues to be 
modified according to the present invention, and one of ordinary skill in the art will be 
able to identify and practice these ways. 
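
The second approach, counting van der Waals contacts, can be sketched as follows. This is a minimal illustration under stated assumptions and not the inventors' procedure: the 4.0 Å distance cutoff, the function name `count_contacts` and the toy coordinates are all hypothetical choices made for the example.

```python
from math import dist

def count_contacts(residue_atoms, neighbour_atoms, cutoff=4.0):
    """Count atom pairs lying within `cutoff` angstroms of each other.

    residue_atoms: (x, y, z) tuples for one interface residue.
    neighbour_atoms: (x, y, z) tuples for the adjoining domain.
    The 4.0 A default is an assumed value, not taken from the text.
    """
    return sum(
        1
        for a in residue_atoms
        for b in neighbour_atoms
        if dist(a, b) <= cutoff
    )

# Toy coordinates (hypothetical), not taken from any real structure.
leu_side_chain = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
constant_domain = [(0.0, 3.0, 0.0), (1.5, 3.5, 0.0), (10.0, 10.0, 10.0)]

contacts = count_contacts(leu_side_chain, constant_domain)
print(contacts)
```

A residue with many such contacts in the larger fragment is one that would become strongly exposed once the adjoining domain is removed.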

The modification referred to above may be effected in a number of ways which are well 
known to one skilled in the art. In a preferred embodiment, the modification is a 
substitution of one or more amino acids at the exposed interface, identified as described 
above, with amino acids which are more hydrophilic. Alternatively, one or more amino 
acids can be inserted in said interface, or one or more amino acids can be deleted from 
said interface, so as to increase its overall hydrophilicity. Furthermore, any combination of
substitution, insertion and deletion can be effected to reduce the hydrophobicity of said 
interface. Also comprised by the present invention is the possibility that the substitution or 
insertion comprises amino acids with a relatively high hydrophobicity value, or that the 
deletion comprises amino acids with relatively low hydrophobicity value, as long as the 
overall hydrophilicity value is increased in said interface region. Modifications such as 
substitution, insertion and deletion can be effected using standard methods which are well 
known to practitioners skilled in the art. By way of example, the skilled artisan can use
either site-directed or PCR-based mutagenesis (Ho et al., 1989; Kunkel et al., 1991; Trower,
1994; Viville, 1994), or total gene synthesis (Prodromou & Pearl, 1992) to effect the
necessary modifications. In a further embodiment, the mutations may be obtained by
random mutagenesis and screening of random mutants, using a suitable expression and
screening system (see, for example, Stemmer, 1994; Crameri et al., 1996).
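
The criterion that the overall hydrophilicity of the interface must increase, even when an individual change introduces a relatively hydrophobic residue, can be illustrated with a short sketch. It assumes the Kyte-Doolittle hydropathy scale purely for convenience; the text cites other tabulations, and the sequences and function names below are hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
# Using this particular scale is an assumption made for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def interface_hydropathy(residues):
    """Sum the hydropathy values of the interface residues (one-letter codes)."""
    return sum(KYTE_DOOLITTLE[r] for r in residues)

def is_more_hydrophilic(parent, modified):
    """True if the modified interface has a lower summed hydropathy."""
    return interface_hydropathy(modified) < interface_hydropathy(parent)

# Hypothetical interface: V -> D substitution combined with an inserted Ala.
parent_interface = "LVL"
modified_interface = "LDLA"
print(is_more_hydrophilic(parent_interface, modified_interface))
```

In this toy case the inserted alanine is itself mildly hydrophobic, yet the summed value still falls, which is the situation the paragraph above allows for.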

In a preferred embodiment the amino acid(s) which replace the more hydrophobic
amino acids include Asn, Asp, Arg, Gln, Glu, Gly, His, Lys, Ser, and Thr. These are among the
more hydrophilic of the 20 naturally-occurring amino acids, and have proven to be 
particularly effective in the application of the present invention. Said amino acids, alone 
or in combination, or in combination with other amino acids, can also be used to form the 
above mentioned insertion which makes the interface region more hydrophilic. 

The parent IgSF domain or fragment referred to above can be one of several different 
types. In a preferred embodiment, said parent domain or fragment is derived from an 
antibody. In one embodiment said parent antibody fragment comprises an Fv fragment. In 
this context the term Fv fragment refers to a complex comprising the VL (variable light) 
and VH (variable heavy) portions of the antibody molecule. In a further embodiment the 
parent antibody fragment may be a single-chain Fv fragment (scFv; Bird et al., 1988;
Huston et al., 1988), in which the VL and VH chains are joined, in either a VL-VH, or VH-VL
orientation, by a peptide linker. In yet a further embodiment, the parent antibody
fragment may be an Fv fragment stabilized by an inter-domain disulphide bond. This is a
structure which can be made by engineering into each chain a single cysteine residue,
wherein said cysteine residues from two chains become linked through oxidation to form a
disulphide (Glockshuber et al., 1990; Brinkmann et al., 1993).

In a most preferred embodiment, the interface region of the variable domains mentioned
above comprises residues 9, 10, 12, 15, 39, 40, 41, 80, 81, 83, 103, 105, 106, 106A, 107, 108
for VL, and residues 9, 10, 11, 13, 14, 41, 42, 43, 84, 87, 89, 105, 108, 110, 112, 113 for VH
according to the Kabat numbering system (Kabat et al., 1991). Said numbering system was
established for the sequences of whole antibodies, but can be adapted correspondingly to
describe the sequences of isolated antibody domains or antibody fragments, even in the
case of scFv fragments, where VL and VH are connected via a peptide linker, and where the
protein sequence from N- to C-terminus has to be numbered differently. This means that
the Kabat numbering system is used in the present invention as a sequence description
relative to the existing data on antibody sequences, not as an absolute description of
actual positions within the antibody fragment sequences of interest.



In a further embodiment said parent antibody fragment comprises a Fab fragment. In this 
context the term Fab refers to a complex comprising the VL-CL (variable and constant 
light) and VH-CH1 (variable and first constant heavy) portions of the antibody molecule. 

In a still further embodiment said parent IgSF fragment is a fusion protein of any of said 
domains or fragments and another protein domain, derived from an antibody or any 
other protein or peptide. The advent of bacterial expression of antibody fragments has 
opened the way to the construction of proteins comprising fusions between antibody 
fragments and other molecules. A further embodiment of the present invention relates to 
such fusion proteins by providing for a DNA sequence which encodes both the modified 
IgSF domain or fragment as described above, as well as an additional moiety. Particularly 
preferred are moieties which have a useful therapeutic function. For example, the 
additional moiety may be a toxin molecule which is able to kill cells (Vitetta et al., 1993).
There are numerous examples of such toxins, well known to those skilled in the art, such as
the bacterial toxins Pseudomonas exotoxin A, and diphtheria toxin, as well as the plant
toxins ricin, abrin, modeccin, saporin, and gelonin. By fusing such a toxin to an antibody
fragment, the toxin can be targeted to, for example, diseased cells, and thereby have a
beneficial therapeutic effect. Alternatively, the additional moiety may be a cytokine, such
as IL-2 (Rosenberg & Lotze, 1986), which has a particular effect (in this case a T-cell
proliferative effect) on a family of cells. In a further embodiment the additional moiety
may confer on its IgSF partner a means of detection and/or purification. For example, the
fusion protein could comprise the modified IgSF domain or fragment and an enzyme
commonly used for detection purposes, such as alkaline phosphatase (Blake et al., 1984).
There are numerous other moieties which can be used as detection or purification tags,
which are well known to the practitioner skilled in the art. Particularly preferred are
peptides comprising at least five histidine residues (Hochuli et al., 1988), which are able to
bind to metal ions, and can therefore be used for the purification of the protein to which
they are fused (Lindner et al., 1992). Also provided for by the invention are additional
moieties such as the commonly used c-myc and FLAG tags (Hopp et al., 1988; Knappik &
Plückthun, 1994).

By engineering one or more fused additional domains, IgSF domains or fragments can be 
assembled into larger molecules which also fall under the scope of the present invention. 
To the extent that the physical properties of the IgSF domain or fragment determine the 
characteristics of the assembly, the present invention provides a means of increasing the 
solubility of such larger molecules. For example, mini-antibodies (Pack, 1994) are dimers
comprising two antibody fragments, each fused to a self-associating dimerization
domain. Dimerization domains which are particularly preferred include those derived from
a leucine zipper (Pack & Plückthun, 1992) or helix-turn-helix motif (Pack et al., 1993).

All of the above embodiments of the present invention can be effected using standard
techniques of molecular biology known to anyone skilled in the art.

The compositions described above may have utility in any one of a number of settings. 
Particularly preferred are diagnostic and therapeutic compositions. 

The present invention also provides methods for making the compositions described above. 
Particularly preferred is a method comprising the following steps: 

i) analyzing the interface region of an IgSF domain for hydrophobic residues which are
solvent-exposed using either a solvent-accessibility approach (Lee & Richards, 1971),
analysis of van der Waals interactions in the interface region, or similar methods which
are well known to one skilled in the art,

ii) identifying one or more of the hydrophobic residues to be exchanged by more
hydrophilic residues, or identifying sites where hydrophilic residues or amino acid
stretches enhancing the overall hydrophilicity of the interface region can be inserted,
said hydrophilic residues preferentially but not exclusively taken from the list Asn, Asp,
Arg, Gln, Glu, Gly, His, Lys, Ser, and Thr,

iii) preparing DNA encoding mutants of said IgSF domain, characterized by the changes 
identified in ii), by using e.g. conventional mutagenesis or gene synthesis methods, said 
DNA being prepared either separately or as a mixture, 

iv) introducing said DNA or DNA mixture in a vector system suitable for expression of said 
mutants, 

v) introducing said vector system into suitable host cells and expressing said mutant or 
mixture of mutants, 

vi) identifying and characterizing mutants which are obtained in higher yield in soluble 
form, and 

vii) if necessary, repeating steps iii) to vi) to increase the hydrophilicity of said identified 
mutant or mutants further. 
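
Steps ii) and iii) amount to enumerating a library of candidate mutants. The sketch below shows one way this bookkeeping might be done; it is an illustration only. The three positions are the ones examined later in the Examples, while the label format, the function name `enumerate_mutants` and the limit of two simultaneous sites are assumptions.

```python
from itertools import combinations, product

# One-letter codes for Asn, Asp, Arg, Gln, Glu, Gly, His, Lys, Ser, Thr.
HYDROPHILIC = "NDRQEGHKST"

def enumerate_mutants(candidates, max_sites=2, replacements=HYDROPHILIC):
    """Yield mutants as tuples of labels such as ('H:L11D', 'H:V84D').

    candidates maps (chain, Kabat position) to the wild-type residue.
    max_sites limits how many positions are changed at once (assumed).
    """
    for n in range(1, max_sites + 1):
        for sites in combinations(sorted(candidates), n):
            for new_residues in product(replacements, repeat=n):
                yield tuple(
                    f"{chain}:{candidates[(chain, pos)]}{pos}{new}"
                    for (chain, pos), new in zip(sites, new_residues)
                )

# Hydrophobic interface positions used here only as an example.
candidates = {("L", 15): "L", ("H", 11): "L", ("H", 84): "V"}
mutants = list(enumerate_mutants(candidates))
print(len(mutants))
```

Each tuple would correspond to one DNA construct in step iii); in practice the list would be narrowed before synthesis, for example by the sequence-variability considerations discussed in the Examples.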

The host referred to above may be any of a number commonly used in the production of
heterologous proteins, including but not limited to bacteria, such as E. coli (Ge et al., 1995),
or Bacillus subtilis (Wu et al., 1993), fungi, such as yeasts (Horwitz et al., 1988; Ridder et
al., 1995) or filamentous fungus (Nyyssönen et al., 1993), plant cells (Hiatt, 1990; Hiatt &
Ma, 1993; Whitelam et al., 1994), insect cells (Potter et al., 1993; Ward et al., 1995), or
mammalian cells (Trill et al., 1995).



The invention is now demonstrated by the following examples, which are presented for 
illustration only and are not intended to limit the scope of the invention. 

Examples 

i) Abbreviations 

Abbreviations: CDR: complementarity determining region; dsFv: disulfide-linked Fv
fragment; IMAC: immobilized metal ion affinity chromatography; IPTG:
isopropyl-β-D-thiogalactopyranoside; i/s: ratio insoluble/soluble; H(X): heavy chain residue
number X; L(X): light chain residue number X; NTA: nitrilo-triacetic acid; OD550: optical
density at 550 nm; PDB: Protein Data Bank; scFv: single-chain Fv fragment; SDS-PAGE: sodium
dodecyl sulfate polyacrylamide gel electrophoresis; v/c: variable/constant; wt: wild type

ii) Material and Methods 

(a) Calculation of solvent accessibility 

Solvent accessible surface areas for 30 non-redundant Fab fragments and the Fv
fragments derived from these by deleting the constant domain coordinates from the PDB
file were calculated using the latest version, as of March 1996, of the program NACCESS
(http://www.biochem.ucl.ac.uk/~roman/naccess/naccess) based on the algorithm
described by Lee & Richards (1971).



(b) scFv gene synthesis 

The single-chain Fv fragment (scFv) in the orientation VL-linker-VH of the antibody
4-4-20 (Bedzyk et al., 1990) was obtained by gene synthesis (Prodromou and Pearl, 1992). The
VL domain carries a three-amino acid long FLAG tag (Knappik and Plückthun, 1994). We
have used two different linkers with lengths of 15 amino acids, (Gly4Ser)3, and 30 amino
acids, (Gly4Ser)6, respectively. The gene so obtained was cloned into a derivative of the vector
pIG6 (Ge et al., 1995). The mutant antibody fragments were constructed by site-directed
mutagenesis (Kunkel et al., 1987) using single-stranded DNA and up to three
oligonucleotides per reaction.
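
The two linker lengths follow directly from repeating the five-residue Gly4Ser motif, as the short sketch below shows. The helper names and the placeholder domain sequences are hypothetical; the real VL and VH sequences of 4-4-20 are not reproduced here.

```python
def gly4ser_linker(repeats):
    """Return the (Gly4Ser)n linker in one-letter code."""
    return "GGGGS" * repeats

def assemble_scfv(vl, vh, repeats):
    """Join two domains in the VL-linker-VH orientation."""
    return vl + gly4ser_linker(repeats) + vh

short_linker = gly4ser_linker(3)   # 15-mer
long_linker = gly4ser_linker(6)    # 30-mer

# Placeholder strings standing in for the variable domain sequences.
scfv = assemble_scfv("<VL>", "<VH>", 3)
print(len(short_linker), len(long_linker))
```
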
(c) Expression

Growth curves were obtained as follows: 20 ml of 2xYT medium containing 100 µg/ml
ampicillin and 25 µg/ml streptomycin were inoculated with 250 µl of an overnight
culture of E. coli JM83 harboring the plasmid encoding the respective antibody fragment
and incubated at 24.5°C until an OD550 of 0.5 was reached. IPTG (Biomol Feinchemikalien
GmbH) was added to a final concentration of 1 mM and incubation was continued for 3
hours. The OD was measured every hour, as was the β-lactamase activity in the culture
supernatant to quantify the degree of cell leakiness. Three hours after induction an
aliquot of the culture was removed and the cells were lysed exactly as described by
Knappik and Plückthun (1995). The β-lactamase activity was measured in the supernatant,
in the insoluble and in the soluble fraction. The fractions were assayed for antibody
fragments by reducing SDS-PAGE, with the samples normalized to OD and β-lactamase
activity to account for possible plasmid loss as well as for cell leakiness. The gels were
blotted and immunostained using the FLAG antibody M1 (Prickett et al., 1989) as the first
antibody, an Fc-specific anti-mouse antiserum conjugated to horseradish peroxidase
(Pierce) as the second antibody, using a chemiluminescent detection assay described
elsewhere (Ge et al., 1995).



(d) Purification 



Mutant scFv fragments were purified by a two-column procedure. After French press lysis
of the cells, the raw E. coli extract was first purified by IMAC (Ni-NTA superflow, Qiagen)
(20 mM HEPES, 500 mM NaCl, pH 6.9; step gradient of imidazole 10, 50 and 200 mM)
(Lindner et al., 1992) and, after dialyzing the IMAC eluate against 20 mM MES pH 6.0,
finally purified by cation exchange chromatography (S-Sepharose fast flow column,
Pharmacia) (20 mM MES, pH 6.0; salt gradient 0-500 mM NaCl). Purity was controlled by
Coomassie stained SDS-PAGE. The functionality of the scFv was tested by competition
ELISA.

Because of its very poor solubility in the periplasmic system, the wt 4-4-20 was expressed
as cytoplasmic inclusion bodies in the T7-based system (Studier & Moffatt, 1986; Ge et al.,
1995). The refolding procedure was carried out as described elsewhere (Ge et al., 1995). For
purification, the refolding solution (2 l) was loaded over 10 h without prior dialysis onto a
fluorescein affinity column, followed by a washing step with 20 mM HEPES, 150 mM
NaCl, pH 7.5. Two column volumes of 1 mM fluorescein (sodium salt, Sigma Chemicals Co.)
pH 7.5 were used to elute all functional scFv fragment. Extensive dialysis (7 days with 12
buffer changes) was necessary to remove all fluorescein. All purified scFv fragments were
tested in gel filtration (Superose-12 column, Pharmacia SMART-System, 20 mM HEPES,
150 mM NaCl, pH 7.5).

(e) Kd determination by fluorescence titration 

The concentrations of the proteins were determined photometrically using an extinction
coefficient calculated according to Gill and von Hippel (1989). Fluorescence titration
experiments were carried out by taking advantage of the intensive fluorescence of
fluorescein. Two ml of 20 mM HEPES, 150 mM NaCl, pH 7.5 containing 10 or 20 nM
fluorescein were placed in a cuvette with integrated stirrer. The excitation wavelength was
485 nm; emission spectra were recorded from 490 to 530 nm. Purified scFv (in 20 mM
HEPES, 150 mM NaCl, pH 7.5) was added in 5 to 100 µl aliquots, and after a 3 min
equilibration time a spectrum was recorded. All spectra were recorded at 20°C. The
maximum of emission at 510 nm was used for determining the degree of complexation
of scFv to fluorescein, seen as quenching as a function of the concentration of the
antibody fragment. The Kd value was determined by Scatchard analysis.
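
For readers unfamiliar with Scatchard analysis, the following sketch shows the underlying arithmetic for simple 1:1 binding, where bound/free plotted against bound gives a straight line of slope -1/Kd. It is not the evaluation used in the experiments: the data are synthetic and noise-free, the Kd and Bmax values are invented, and the conversion of fluorescence quenching into bound and free concentrations is assumed to have been done already.

```python
def scatchard_kd(bound, free):
    """Estimate (Kd, Bmax) from a least-squares line through B/F versus B.

    For 1:1 binding, B/F = (Bmax - B) / Kd, so slope = -1/Kd and
    intercept = Bmax/Kd. Units follow those of the inputs.
    """
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    return kd, intercept * kd

# Synthetic data generated from hypothetical parameters (nM).
true_kd, true_bmax = 2.0, 10.0
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [true_bmax * f / (true_kd + f) for f in free]

kd, bmax = scatchard_kd(bound, free)
print(round(kd, 6), round(bmax, 6))
```

With real titration data the points scatter around the line, and the linear transformation distorts the error structure, so a direct non-linear fit is often used as a cross-check.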



(f) Equilibrium denaturation measurement 

Equilibrium denaturation curves were obtained by denaturation of 0.2 µM protein in
HEPES buffered saline (HBS) buffer (20 mM HEPES, 150 mM NaCl, 1 mM EDTA, pH 7.5)
and increasing amounts of urea (1.0-7.5 M; 20 mM HEPES, 150 mM NaCl, pH 7.4; 0.25
M steps) in a total volume of 1.7 ml. After incubating the samples at 10°C for 12 hours 
and an additional 3 hours at 20°C prior to measurements, the fluorescence spectra were 
recorded at 20°C from 320-360 nm with an excitation wavelength of 280 nm. The 
emission wavelength of the fluorescence peak shifted from 341 to 347 nm during 
denaturation and was used for determining the fraction of unfolded molecules. Curves 
were fitted according to Pace (1990). 
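
Fitting of this kind is commonly based on a two-state model with linear extrapolation of the unfolding free energy, ΔG = ΔG(H2O) - m[urea]. The sketch below evaluates that model; it is an illustration under that assumption, the parameter values are invented, and no claim is made that it reproduces the exact fitting routine used here.

```python
from math import exp

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def fraction_unfolded(urea, dg_water, m_value, temp_k=293.15):
    """Two-state fraction unfolded at a given urea concentration (M).

    dg_water: unfolding free energy in the absence of urea (kJ/mol).
    m_value: dependence of the free energy on urea (kJ/(mol M)).
    temp_k: temperature in kelvin; 293.15 K corresponds to 20 degrees C.
    """
    dg = dg_water - m_value * urea      # linear extrapolation
    k_unfold = exp(-dg / (R * temp_k))  # equilibrium constant U/N
    return k_unfold / (1.0 + k_unfold)

def midpoint(dg_water, m_value):
    """Urea concentration at which half of the molecules are unfolded."""
    return dg_water / m_value

# Hypothetical parameters chosen only to give a midpoint of 4 M urea.
dg_water, m_value = 20.0, 5.0
curve = [(u / 4, fraction_unfolded(u / 4, dg_water, m_value)) for u in range(4, 31)]
print(midpoint(dg_water, m_value))
```

The urea values in `curve` run from 1.0 to 7.5 M in 0.25 M steps, matching the range described above.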



(g) Thermal denaturation 

For measuring the thermal denaturation rates, purified scFv was dissolved in 2 ml 
buffer to a final concentration of 0.5 µM. The aggregation was followed for 2.5 h at
and at 44°C by light scattering at 400 nm. 




iii) Results 



(a) Comparison of known antibody sequences 

Compared to other domain/domain interfaces in proteins, the interface between 
immunoglobulin variable and constant domains is not very tightly packed. A comparison 
of 30 non-redundant Fab structures in the PDB database showed that between the light 
chain variable and constant domain an area of 410 ± 90 Å² per domain is buried, while
the heavy chain variable and constant domains interact over an area of 710 ± 180 Å².
Some, but not all of the interface residues are hydrophobic, predominantly aliphatic. 
Generally, sequence conservation of the residues contributing to the v/c domain interface 
is not particularly high. Still, the v/c domain interface shows up as a marked hydrophobic 
patch on the surface of an Fv fragment (Fig. 1). 

Solvent accessible surface areas for 30 non-redundant Fab fragments and their
corresponding Fv fragments (derived from the Fab fragment by deleting the constant
domain coordinates from the PDB file) were calculated using the program NACCESS (Lee
& Richards, 1971). Residues participating in the v/c domain interface were identified by
comparing the solvent-accessible surface area of each amino acid side chain in the
context of an Fv fragment to its accessible surface in the context of an Fab fragment.
Figure 2 shows a plot of the relative change in side chain accessibility upon deletion of
the constant domains as a function of sequence position. Residues which show a
significant reduction of side chain accessibility are also highlighted in the sequence
alignment. To assess sequence variability in the positions identified in Figure 2, the
variable domain sequences collected in the Kabat database (status March 1996) were
analyzed (Table 1). Of the 15 interface residues identified in the VL domain of the
antibody 4-4-20 (Fig. 1 and Table 1), L9(leu), L12(pro), L15(leu), L40(pro), L83(leu), and
L106(ile) are hydrophobic and therefore candidates for replacement. Of the 16 interface
residues in the VH domain, H11(leu), H14(pro), H41(pro), H84(val), H87(met) and H89(ile)
were identified as possible candidates for substitution by hydrophilic residues in the scFv
fragment of the antibody 4-4-20 (Fig. 1 and Table 1).
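
The comparison of side-chain accessibility between the Fv and Fab context can be sketched as below. The sketch assumes that per-residue accessible surface areas have already been computed by a separate program; the normalisation (change relative to the Fv value), the 0.2 threshold, the function name and all numbers are assumptions made for illustration, not values from this work.

```python
def interface_residues(sasa_fv, sasa_fab, threshold=0.2):
    """Return residues whose side chains are markedly more exposed in the Fv.

    sasa_fv, sasa_fab: dicts mapping a residue label (e.g. 'H84') to the
    side-chain accessible surface area (square angstroms) in the Fv and
    Fab context respectively.
    threshold: minimal relative change (fv - fab) / fv; 0.2 is assumed.
    """
    result = {}
    for label, fv_area in sasa_fv.items():
        if fv_area <= 0.0:
            continue  # fully buried in the Fv, no meaningful ratio
        change = (fv_area - sasa_fab[label]) / fv_area
        if change >= threshold:
            result[label] = change
    return result

# Invented accessibility values for three positions.
sasa_fv = {"H11": 80.0, "H84": 60.0, "H30": 50.0}
sasa_fab = {"H11": 20.0, "H84": 15.0, "H30": 50.0}

exposed = interface_residues(sasa_fv, sasa_fab)
print(sorted(exposed))
```

Residues flagged in this way would then be checked for hydrophobicity and sequence conservation, as described in the surrounding text.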

Not all of these hydrophobic residues are equally good candidates for replacements, 
however. While residues which are hydrophobic in one particular sequence but hydrophilic 
in many other sequences may appear most attractive, the conserved hydrophobic residues 
listed in Table 1 have also been investigated, since the evolutionary pressure which kept 
these conserved residues acted on the Fab fragment within the whole antibody, but not 
the isolated Fv portion. In this study, we did not replace the proline residues since pro L40
and pro H41 form the hairpin turns at the bottom of the framework II region, while the
conserved VL cis-proline L8 and proline residues H9 and H14 determine the shape of
framework I of the immunoglobulin variable domains.

[...] other VH domains frequently ala or ser), H87 (met, usually [...]
frequently val) in VH as possible candidates for replacement in the 4-4-20 scFv fragment.

(b) Point mutations in the 4-4-20 scFv 

For the 4-4-20 scFv fragment, some of the crucial residues identified in the sequence
analysis described above are already hydrophilic, but nevertheless 9 residues are of
hydrophobic nature (including pro12 in the light chain) (Table 1). We chose three residues
for closer analysis by mutations.

Leu15 in VL is a hydrophobic amino acid in 98% of all kappa chains (Table 1). Leu11 is
conserved in VH (Table 1) and is involved in v/c interdomain contacts (Lesk & Chothia,
1988). In contrast, valine occurs very infrequently at position H84; mainly found at this
position are threonine or serine and alanine (Table 1). As can be seen in Figure 1, val84 is
contributing to a large hydrophobic patch at the newly exposed surface of VH. All three
positions were mutated into acidic residues, and L11 was also changed to asparagine
(Table 2).

The scFv fragment was tested and expressed with two different linkers, the 15-mer linker
(Gly4Ser)3 (Huston et al., 1999) and the same motif extended to 30 amino acids,
(Gly4Ser)6. All mutations were tested in both constructs. The in vivo results of the different
mutations on solubility were identical, and therefore only the results of the 30-mer linker
are described in more detail. The periplasmic expression experiments were carried out at
24.5°C and all constructs were tested for soluble and insoluble protein by
immunoblotting. The ratio of insoluble to soluble (i/s) protein was determined for every
mutant. In Figure 3 A-D, insoluble (lane 1) and soluble (lane 2) fractions of the wt scFv are
shown. Nearly no soluble material occurs in periplasmic expression, which is consistent
with previous reports of Bedzyk et al. (1990) and Denzin et al. (1991), who described
earlier that the periplasmic expression of the wt scFv leads mainly to periplasmic inclusion
bodies.



The single point mutation L15E in VL (Flu1) shows no effect on the ratio i/s when 
compared with the wt (Fig. 3A, lane 3, 4). Mutating leu at position 11 in the heavy chain 
[...] fold increase of soluble protein compared to the wt. 

[...] compared with Flu3 (Fig. 3A, lane 7, 8) [...] and Flu4 (lane 3, 4 and 7, 8) is shown 
in both the 15-mer and [...]. 
The single point mutation V84D turned out to be the protein with the best i/s ratio in both 
constructs, with the 15-mer and the 30-mer linker scFv. 

(c) Functional expression and purification 

[...] and the N-terminus of VL than between the C-terminus of VL and the N-terminus of VH 
(Huston et al., 1995). Consequently, a linker of identical length may lead to different 
properties of the resulting molecules. 

Since we have chosen to use the minimal [...] 
at the N-terminus of VL in our constructs and thus the VL-linker-VH orientation, we 
investigated the use of longer linkers. In the periplasmic expression in E. coli, no difference 
between the 15-mer and the 30-mer linker in the corresponding mutants is visible [...]. 

[...] discrepancy between the two constructs was found. The purification of the Flu4 mutant 
(V84D) with the 15-mer linker leads to very small amounts of partially purified protein 
(about 0.15 mg per liter and OD; estimated from SDS-PAGE after IMAC purification), 
whereas the 30-mer linker construct gives about 0.3 mg per liter and OD of highly pure 
functional protein. All mutants with 30-mer linker were tested in gel filtration and found 
to be monomeric (data not shown). 

For further in vitro characterization three mutants were purified with the 30-mer linker, 
V84D (Flu4), V84D/L11D (Flu6) and L11D (Flu3). A two-step chromatography, first using 
IMAC and then cation-exchange chromatography, led to homogeneous protein. The i/s 
ratio of the antibody fragments (Fig. 3) is also reflected in the purification yield of 
functional protein. The highly soluble mutant Flu4 (V84D) (Fig. 3B lane 3, 4) yields about 
0.3 mg purified and functional protein per liter and OD, Flu6 (L11D/V84D) (Fig. 3B lane 7, 
8) yields about 0.25 mg per liter and OD and Flu3 (less soluble material on the blot in Fig. 
3A lane 7, 8) yields 0.05 mg per liter and OD. The wt scFv of the antibody 4-4-20 does not 
give any soluble protein at all in periplasmic expression with either linker, and it was 
therefore expressed as cytoplasmic inclusion bodies, followed by refolding in vitro and 
fluorescein affinity chromatography. The refolded wt scFv was shown by gel filtration to 
be monomeric with the 30-mer linker (data not shown). 

(d) Biophysical properties of the mutant scFvs 

Since we changed amino acids which are conserved, it cannot be excluded that changes at 
these positions may be transmitted through the structure and have an effect on the 
binding constant, even though they are very far from the binding site (Chatellier et al., 
1996). To eliminate this possibility, we determined the binding constant of the mutants 
Flu3, Flu4, Flu6 and the wt scFv. Fluorescence titration was used to determine Kd in 
solution by using the quenching of the intrinsic fluorescence of fluorescein when it binds 
to the antibody. The fluorescence quenching at 510 nm was measured as a function of 
added scFv. The Kd values (Table 3 and Fig. 4) obtained for all three mutant scFvs and the 
wt scFv are very similar and correspond very well to the recently corrected Kd of the 
monoclonal antibody 4-4-20 (Miklasz et al., 1995). 

To determine whether the mutations had an influence on the thermodynamic stability of 
the protein we determined the equilibrium unfolding curves by urea denaturation. The V84D 
mutant and the wt scFv were used for this analysis, and in Figure 5 an overlay plot is 
shown. The midpoint of both curves is at 4.1 M urea. Both curves were fitted by an 
algorithm for a two-state model described by Pace (1990), but the apparent small 
difference between the V84D mutant and the wt scFv is not of statistical significance. 
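
The text does not reproduce the fitting equations. As an illustration, a two-state model is commonly written with a linear dependence of the unfolding free energy on denaturant concentration, dG = dG_water - m * [urea], so that the midpoint lies at dG_water / m. The Python sketch below assumes this form; the m value and the resulting dG_water are hypothetical numbers chosen only so that the midpoint matches the 4.1 M urea reported above, and they are not measured parameters.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(urea_molar, dg_water, m_value, temp_kelvin=298.15):
    """Fraction of unfolded protein in a two-state model.

    Assumes dG(urea) = dg_water - m_value * urea_molar (linear extrapolation).
    """
    dg = dg_water - m_value * urea_molar
    k_unfold = math.exp(-dg / (R_KJ * temp_kelvin))
    return k_unfold / (1.0 + k_unfold)


def midpoint(dg_water, m_value):
    """Denaturant concentration at which half of the protein is unfolded."""
    return dg_water / m_value


# Hypothetical parameters, chosen only to reproduce a 4.1 M midpoint.
M_VALUE = 5.0              # kJ/(mol*M), assumed
DG_WATER = 4.1 * M_VALUE   # kJ/mol, follows from the assumed m value
```

Two proteins with the same midpoint and a similar m value give overlapping curves under this model, which is the situation described for the V84D mutant and the wt scFv.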

Aggregation of folding intermediates could be another explanation for the different in 
vivo results between the mutant scFvs and the wt scFv (Fig. 3). In the periplasm of E. coli, 
the protein concentrations are assumed to be rather high (van Wielink & Duine, 1990) and 
aggregation effects could thus be pronounced. In order to estimate the aggregation 
behavior in vitro, we have measured the thermal aggregation rates at different 
temperatures. In Figure 6 it is clearly seen that the wt scFv is aggregating 
already at 44°C, whereas the mutant V84D tends to aggregate more slowly. The wt scFv 
is thus clearly more aggregation prone than the mutant scFv. This is very similar to the 
observations made with different mutations on the antibody McPC603 (Knappik & 
Pluckthun, 1995), where no correlation was found between equilibrium unfolding 
curves and expression behavior, but a good correlation was found with the thermal 
aggregation rates. 

Figures and Tables 

Figure 1: Space-filling representation of the Fv fragment of the antibody 
4-4-20, color-coded for residue types. Orange: aromatic side chains (tyr, phe, trp); 
yellow: aliphatic side chains (leu, ile, val, pro, ala); sulfur-containing side chains (met, 
cys); green: uncharged, hydrophilic side chains (thr, ser, asn, gln); red: acidic side chains 
(glu, asp); blue: basic side chains (his, arg, lys); white: main chain (hydrophobicity 
color-code would be yellow-green). 

Figure 2: Variable/constant domain interface residues for VL (2a) and VH (2b). For 30 
non-redundant Fab fragments taken from the Brookhaven Databank, the solvent 
accessible surface of the amino acid side chains was calculated in the context of an 
Fv and of an Fab fragment. The plot shows the relative reduction in accessible surface 
upon contact with the constant domains (color-code: 30 different colors for the 30 
Fv fragments). In the sequence alignment, residues contributing to the v/c interface 
are highlighted. The shading from white (< 1%) to red (> 80%) reflects the relative 
reduction of solvent accessible surface upon removing the constant domains 
(color-code: white < 1%; yellow < 20%; yellow-orange < 40%; orange < 60%; 
red-orange < 80% and red > 80%). Circles indicate those positions which are further 
analyzed in Table 1. 
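
The quantity plotted in Figure 2 can be illustrated with a short Python sketch. It assumes that the relative reduction is computed as the difference between the side-chain accessible surface in the Fv and in the Fab, divided by the value in the Fv; the function names and the handling of a value of exactly 80% (assigned to red here) are assumptions made for illustration.

```python
# Upper bounds (in percent) and shades taken from the Figure 2 legend.
_SHADES = (
    (1.0, "white"),
    (20.0, "yellow"),
    (40.0, "yellow-orange"),
    (60.0, "orange"),
    (80.0, "red-orange"),
)


def relative_reduction(asa_fv, asa_fab):
    """Percent of the side-chain surface accessible in the Fv that is buried in the Fab."""
    if asa_fv <= 0.0:
        return 0.0  # a fully buried side chain cannot lose further surface
    return 100.0 * (asa_fv - asa_fab) / asa_fv


def shade(percent):
    """Map a relative reduction to the shading used in the sequence alignment."""
    for upper_bound, color in _SHADES:
        if percent < upper_bound:
            return color
    return "red"  # 80% and above (assignment of exactly 80% is an assumption)
```

For example, a side chain with a hypothetical accessible surface of 100 square angstroms in the Fv and 15 in the Fab has a relative reduction of 85% and would be shaded red.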

Figure 3: Western blots showing the insoluble (i) and soluble (s) fractions of cell extracts, 
prepared as described in Material and Methods, expressing the scFv fragments of the 
antibody 4-4-20. The amino acids substituted in the various mutants are given in 
Table 2. 

Figure 4: Scatchard plot of the fluorescence titration of fluorescein (20 nM) with antibody 
([...] to 800 nM), measured at 510 nm. The value r was obtained from (F − F0)/(F∞ − F0), 
where F is the measured fluorescein fluorescence at a given antibody concentration, 
F0 is the fluorescence in the absence of antibody and F∞ when antibody is present in 
large excess. Note that r gives the saturation of fluorescein by antibody. (a) Titration 
of wt scFv, (b) titration of Flu4 (V84D). 
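
The Scatchard analysis summarized in this legend can be sketched as follows, assuming simple 1:1 binding, for which r/[free antibody] plotted against r is a straight line with slope -1/Kd. The titration points, the fluorescence levels F0 and F∞, and the helper `simulate_fluorescence` are hypothetical and serve only to generate synthetic data; the Kd of 70 nM used for the simulation is chosen to lie in the range reported in Table 3.

```python
import math

FLUORESCEIN_NM = 20.0  # total fluorescein concentration given in the legend


def saturation(f, f0, f_inf):
    """r = (F - F0) / (F_inf - F0): fraction of fluorescein bound by antibody."""
    return (f - f0) / (f_inf - f0)


def simulate_fluorescence(antibody_nm, kd_nm, f0=100.0, f_inf=4.0):
    """Hypothetical helper: signal for exact 1:1 binding (synthetic data only)."""
    total = antibody_nm + FLUORESCEIN_NM + kd_nm
    complex_nm = (total - math.sqrt(total ** 2 - 4.0 * antibody_nm * FLUORESCEIN_NM)) / 2.0
    r = complex_nm / FLUORESCEIN_NM
    return f0 + r * (f_inf - f0)


def scatchard_kd(antibody_totals_nm, r_values):
    """Least-squares line through (r, r/[Ab_free]); returns Kd = -1/slope in nM."""
    xs, ys = [], []
    for a_total, r in zip(antibody_totals_nm, r_values):
        a_free = a_total - r * FLUORESCEIN_NM  # antibody not bound to fluorescein
        xs.append(r)
        ys.append(r / a_free)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxx / sxy


totals = [10.0, 50.0, 100.0, 200.0, 400.0, 800.0]  # nM, made-up titration points
signals = [simulate_fluorescence(a, kd_nm=70.0) for a in totals]
r_values = [saturation(f, 100.0, 4.0) for f in signals]
kd_estimate = scatchard_kd(totals, r_values)  # recovers the simulated Kd
```

With noise-free synthetic data the fit returns the simulated Kd; with real titration data the scatter around the line determines the error of the estimate.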

Figure 5: An overlay plot of the urea denaturation curves is shown. (x) wt scFv, (o) Flu4. 

Figure 6: Thermal denaturation time courses at 40 and 44°C for wt and Flu4 scFv 
fragment are shown. (a) wt scFv at 40°C, (b) Flu4 at 40°C, (c) Flu4 at 44°C, (d) wt scFv 
at 44°C. 

Table 1: Sequence variability of residues contributing to the v/c interface. 

Table 2: Mutations introduced in the scFv fragment of the antibody 4-4-20. 

Table 3: Kd values of the different scFv mutants determined in fluorescence titration. 

Claims 

1. A DNA sequence which encodes an immunoglobulin superfamily (IgSF) domain which 
differs from a parent IgSF domain in that the region which comprised or would 
comprise the interface with a second domain adjoined to said parent IgSF domain 
within the chain of a larger IgSF fragment or protein is made more hydrophilic by 
modification. 

2. The DNA sequence according to claim 1 in which said modification is substitution of 
one or more amino acids at said interface with amino acids which are more 
hydrophilic. 

3. The DNA sequence according to claim 1 in which said modification is insertion of one 
or more hydrophilic amino acids in said interface, or insertion of amino acids which 
increase the overall hydrophilicity in said interface, or deletion of one or more 
hydrophobic amino acids in said interface, or deletion of amino acids, said deletion 
leading to an increase in the overall hydrophilicity in said interface. 

4. The DNA sequence according to claim 1 in which said modification consists of any 
two or more of: 

a) substitution of one or more amino acids at said interface with amino acids 
which are more hydrophilic, 

b) insertion of one or more hydrophilic amino acids in said interface, or insertion 
of amino acids which increase the overall hydrophilicity in said interface, 

c) deletion of one or more hydrophobic amino acids in said interface, or deletion 
of amino acids, said deletion leading to an increase in the overall hydrophilicity in 
said interface. 

5. The DNA sequence according to any of claims 2 to 4 in which said substituted or 
inserted amino acid is taken from the list Asn, Asp, Arg, Gln, Glu, Gly, His, Lys, Ser, and 
Thr. 

6. The DNA sequence according to any of claims 1 to 5 in which said parent IgSF 
domain is part of an IgSF fragment. 

7. The DNA sequence according to any of claims 1 to 6 in which said domain or 
fragment is derived from an antibody. 

8. The DNA sequence according to claim 7 in which said fragment is a Fab fragment. 



9. The DNA sequence according to claim 7 in which said fragment is an Fv fragment. 

10. The DNA sequence according to claim 7 in which said fragment is a scFv fragment. 

11. The DNA sequence according to claim 7 in which said fragment is an Fv stabilized by 
an inter-domain disulphide bond. 

12. The DNA sequence according to any of claims 9 to 11 in which said interface region 
comprises residues 9, 10, 12, 15, 39, 40, 41, 80, 81, 83, 103, 105, 106, 106A, 107, 108 
for VL, and residues 9, 10, 11, 13, 14, 41, 42, 43, 84, 87, 89, 105, 108, 110, 112, 113 for 
VH. 

13. The DNA sequence according to any of claims 1 to 12, having a contiguous sequence 
which encodes one or more additional moieties. 

14. The DNA sequence according to claim 13 in which at least one of said additional 
moieties is a toxin, a cytokine, or a reporter enzyme. 

15. The DNA sequence according to claim 13 in which at least one of said additional 
moieties is capable of binding a metal ion. 

16. The DNA sequence according to claim 15 in which at least one of said additional 
moieties comprises at least five histidines. 

17. The DNA sequence according to claim 13 in which said moiety is a peptide. 

18. The DNA sequence according to claim 17 in which said peptide is a labelling tag. 

19. The DNA sequence according to claim 18 in which said labelling tag is c-myc or FLAG. 

20. The DNA sequence according to claim 17 in which said peptide comprises an 
association domain which results in self-association of two or more of said antibody 
fragments. 

21. The DNA sequence according to claim 20 in which said association domain is derived 
from a leucine zipper or from a helix-turn-helix motif. 

22. The DNA sequence according to claim 17 in which said peptide comprises a first 
association domain which results in hetero-association of one or more of said 
antibody fragments with one or more peptides or proteins comprising a second 
hetero-association domain being able to associate with said first hetero-association 
domain. 

23. A vector comprising a DNA sequence according to any of claims 1 to 22. 

24. A host cell comprising a vector according to claim 23. 

25. An IgSF domain or fragment, or a fusion protein comprising an IgSF domain or 
fragment, encoded by a DNA sequence according to any of claims 1 to 22, by a vector 
according to claim 23, or produced by a host cell according to claim 24. 

26. A diagnostic composition comprising an IgSF domain or fragment, or a fusion protein 
comprising an IgSF domain or fragment, according to claim 25. 

27. A therapeutic composition comprising an IgSF domain or fragment, or a fusion 
protein comprising an IgSF domain or fragment, according to claim 25. 

28. A method for deriving a DNA sequence according to any of claims 1 to 22 which 
comprises the following steps: 

i) analyzing the interface region of a parent IgSF domain for hydrophobic residues 
which are solvent-exposed, 

ii) identifying one or more of the hydrophobic residues to be substituted by more 
hydrophilic residues, or one or more positions where hydrophilic residues or amino 
acid stretches enhancing the overall hydrophilicity of the interface region can be 
inserted into said interface region, or one or more positions where hydrophobic 
residues or amino acid stretches enhancing the overall hydrophobicity of the 
interface region can be deleted from said interface region, or any combination of 
said substitutions, said insertions, and said deletions to give one or more mutants 
of said parent IgSF domain. 
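
Steps i) and ii) of claim 28 can be illustrated by a minimal Python sketch. The interface positions are those listed in claim 12 and the replacement residues those of claim 5; the set of residues treated as hydrophobic, the exposure threshold of 20%, the function name and the example accessibilities are assumptions made only for this illustration.

```python
# Interface positions as listed in claim 12 (Kabat numbering, kept as strings).
VL_INTERFACE = {"9", "10", "12", "15", "39", "40", "41", "80", "81", "83",
                "103", "105", "106", "106A", "107", "108"}
VH_INTERFACE = {"9", "10", "11", "13", "14", "41", "42", "43", "84", "87",
                "89", "105", "108", "110", "112", "113"}

# Replacement residues named in claim 5.
HYDROPHILIC_CHOICES = ("Asn", "Asp", "Arg", "Gln", "Glu", "Gly", "His", "Lys",
                       "Ser", "Thr")

# Residues treated as hydrophobic in this sketch (an assumption).
HYDROPHOBIC = {"Ala", "Val", "Leu", "Ile", "Pro", "Phe", "Trp", "Met"}


def candidate_positions(residues, interface, min_exposure=20.0):
    """Return interface positions carrying a solvent-exposed hydrophobic residue.

    residues maps a Kabat position to (three-letter residue name,
    relative side-chain accessibility in percent). The threshold
    min_exposure is a hypothetical cut-off.
    """
    hits = []
    for position, (name, exposure) in residues.items():
        if position in interface and name in HYDROPHOBIC and exposure >= min_exposure:
            hits.append(position)
    return hits


# Made-up input for a VH domain; the accessibilities are not measured values.
example_vh = {
    "11": ("Leu", 60.0),
    "84": ("Val", 55.0),
    "43": ("Lys", 70.0),   # exposed but already hydrophilic
    "89": ("Ile", 5.0),    # hydrophobic but buried
    "50": ("Leu", 80.0),   # exposed and hydrophobic, but not an interface position
}
```

Each position returned by `candidate_positions` would then be a candidate for substitution by one of the residues in `HYDROPHILIC_CHOICES`, to be tested experimentally as in claim 29.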

29. A method for making an IgSF domain or fragment, or a fusion protein comprising an 
IgSF domain or fragment, according to claim 25 which comprises the following steps: 

i) deriving a DNA sequence according to claim 28, 

ii) preparing DNA encoding said mutant or mutants, said DNA being prepared 
either separately or as a mixture, 

iii) introducing said DNA or DNA mixture in a vector system suitable for expression 
of said mutant or mutants, said vector system optionally comprising one or more 
additional DNA sequences suitable for expression of additional IgSF domains or 
fragments, or one or more DNA sequences suitable for expression of a fusion 
protein comprising said mutant or mutants, or any combination of said additional 
DNA sequences, 

iv) introducing said vector system into suitable host cells and expressing said 
mutant or mixture of mutants, or expressing said mutants or mixture of mutants in 
combination with the expression products of said additional DNA sequences, 

v) identifying and characterizing one or more mutants, alone or in said 
combination, which are obtained in higher yield in soluble form, and 

vi) if necessary, repeating steps ii) to vi) to increase the hydrophilicity of said 
identified mutant or mutants, alone or in said combination, further. 

30. The method according to claim 29 in which said host is a bacterium, a fungus, a 
plant, an insect cell, or a cell-line of mammalian origin. 

Summary 

The present invention relates to the modification of immunoglobulin superfamily 
(IgSF) domains, IgSF fragments and fusion proteins thereof, especially to the 
modification of antibody derivatives, so as to improve their solubility, and hence the 
yield and ease of handling. The inventors have found that this can be achieved by 
making the region which comprised the interface with domains adjoined to said IgSF 
domain in a larger fragment or a full IgSF protein, and which becomes exposed in the 
IgSF domain, more hydrophilic by modification. The present invention describes DNA 
sequences encoding modified IgSF domains or fragments and fusion proteins thereof, 
vectors and hosts containing these DNA sequences, IgSF domains or fragments or 
fusion proteins obtainable by expressing said DNA sequences in suitable expression 
systems, and a method for modifying IgSF domains, so as to improve their solubility, 
expressibility and ease of handling. 

Fig. 2a: Variable/constant domain interface residues for VL 

Fig. 2b: Variable/constant domain interface residues for VH 

Fig. 3: Western blots showing the insoluble (i) and soluble (s) fractions of cell extracts. 

Fig. 4: Scatchard plots of fluorescence titration of fluorescein with antibody. 
a: Titration of wt scFv 

Fig. 6: Thermal denaturation time courses at 40°C and 44°C for wt and Flu4 scFv fragment are 
shown. (a) wt scFv at 40°C, (b) Flu4 at 40°C, (c) Flu4 at 44°C, (d) wt scFv at 44°C. 

Table 1: Sequence variability of residues contributing to the v/c interface 

[...] variable domain sequences in the database (March 1996). Sequences which were <90% 
complete were excluded from the analysis. Number of sequences analyzed: human VL kappa: 
404 of 881, murine VL kappa: 1061 of 2239, human VL lambda: 223 of 409, murine VL lambda: 
71 of 206, human VH: 663 of 1756, murine VH: 1294 of 3849. 

[...] 1991. [...] (Fab) refers to the relative side chain accessibility in an Fab fragment as 
calculated by the program NACCESS. [...] refers to the relative side chain accessibility in 
the isolated VL or VH domain. %buried refers to the relative difference in side chain 
accessibility between Fv and Fab fragment. Cons. refers to the sequence consensus, and 
Dist. to the distribution of residue types. 

■: Naccess v2.0 by Simon Hubbard (http://www.biochem.ucl.ac.uk/~roman/naccess/naccess.html) 


Table 2: Mutations introduced in the scFv fragment of the antibody 4-4-20. 

              L15E (VL)   L11N (VH)   L11D (VH)   V84D (VH) 
Flu 1 
Flu 2 
Flu 3 
Flu 4 
Flu 5 
Flu 6 
Flu 7 
Flu 8 
Flu 9 
Flu 4 short 

Each line represents a different protein carrying the mutations indicated. The residues are 
numbered according to Kabat et al. (1991). 



Table 3: Kd values of the different scFv mutants determined in fluorescence titration. 

            Flu wt    Flu 3     Flu 4     Flu 6     Flu wt# 
Kd (nM)     80 ± 7    60 ± 12   70 ± 10   75 ± 13   90 

The Kd values are given in nM; the error was calculated from the Scatchard analysis (Fig. 4). 
# determined by Miklasz et al. (1995) 



